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Abstract —Parametric filters, such as the Extended Kalman 
Filter and the Unscented Kalman Filter, typically scale well with 
the dimensionality of the problem, but they are known to fail if 
the posterior state distribution cannot be closely approximated 
by a density of the assumed parametric form. 

For nonparametric filters, such as the Particle Filter, the 
converse holds. Such methods are able to approximate any 
posterior, but the computational requirements scale exponen¬ 
tially with the number of dimensions of the state space. In 
this paper, we present the Coordinate Particle Filter which 
alleviates this problem. We propose to compute the particle 
weights recursively, dimension by dimension. This allows us 
to explore one dimension at a time, and resample after each 
dimension if necessary. 

Experimental results on simulated as well as real data con¬ 
firm that the proposed method has a substantial performance 
advantage over the Particle Filter in high-dimensional systems 
where not all dimensions are highly correlated. We demonstrate 
the benefits of the proposed method for the problem of multi¬ 
object and robotic manipulator tracking. 

I. Introduction 

Decision making requires knowledge of some variables of 
interest. In the vast majority of real-world problems these 
variables are latent, i.e. they cannot be observed directly 
but have to be inferred based on sensor measurements. If 
decision making has to be performed online, these latent 
variables have to be inferred online as well. Therefore, past 
measurements have to be fused with incoming measurements 
to always maintain an up-to-date belief over the latent 
variables. This problem is called filtering and the underlying 
dynamical system is typically modeled with a State Space 
Model (see Fig. 1). 



Fig. 1. The belief network which characterizes the evolution of the state 
x and the observations y. 
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More formally, filtering means inferring the state a; at a 
given time, knowing only the measurements y up to that 
time. Applications range from robotics, over estimating a 
digital communication signal using noisy measurements, to 
estimating the volatility of financial instruments using stock 
market data [4], Often these systems have high dimensional 
state spaces. For example, the full joint state of a humanoid 
robot is typically over 50 dimensional. One of the most 
popular algorithms for performing inference in dynamical 
systems, the Particle Filter (PF), breaks down in such high 
dimensional systems [7, 2], In this paper, we address this 
issue and propose a novel filter called the Coordinate Particle 
Filter (CPF) which scales well with the dimensionality of the 
state for systems where only a subset of the state dimensions 
is highly correlated. 

Notation and Problem Statement 

We assume the underlying process of the dynamical sys¬ 
tem to be stationary. Therefore the absolute time indices are 
irrelevant, only the time difference matters within a figure or 
equation. To simplify notation we will thus use the indices 
0,1, 2 throughout the paper. Furthermore, we introduce the 
notation : t to denote all the time steps up to t. 

The system is described by two functions, the process 
model g and the measurement model h. 

X2=g{Xl,V 2 ) (l) 

y 2 =h{x 2 ,w 2 ) (2) 

Without loss of generality, the noise variables v and w can 
be modeled as normally distributed with zero mean and unit 
variance, since they can always be mapped onto different 
distributions inside of the process and measurement models. 

In filtering, the distribution of interest is the current belief 
given all the measurements taken so far p{x 2 \y. 2 ). It can be 
computed recursively as follows 

p{x 2 \y [2 ) ex p(y 2 \x 2 ) / p{x 2 \x\ = x)p(xi = x\y-.i) ( 3 ) 

■<x 

which is the well known Bayes filter. Only in very few 
cases can this integral be solved analytically, there are thus 
numerous approximation methods, the most prominent ones 
being the Extended Kalman Filter (EKF), the Unscented 
Kalman Filter (UKF) and the Particle Filter. 

The EKF is known to fail if the system exhibits substantial 
non-linearity [3], Many algorithms have been proposed to 
improve the performance of the EKF, most prominently the 
Unscented Kalman Filter (UKF) [6], The UKF has been 
applied successfully in many settings where the posterior 







distribution can be closely approximated by a Gaussian. In 
nonlinear systems this assumption can be violated, which 
often precludes the usage of a UKF [3], 

These limitations have led to a big interest in alternative 
methods which can represent a bigger family of dynamical 
systems. Sequential Monte Carlo, i.e. Particle Filtering meth¬ 
ods, started to be used widely since the seminal work by 
Gordon et al. [5], These methods are applicable in general 
state-space models and allow the computation of all kinds 
of moments, quantiles and highest posterior density regions 
[3], whilst the EKF and UKF approximate only the first 
and second order moments. The Particle Filter has found 
applications in practically all areas of signal processing 
concerned with Bayesian dynamical models [3], such as 
signal processing, control, robotics, finance, and statistics. 

Unfortunately the PF is not well suited for high dimen¬ 
sional systems. The number of particles N has to scale 
exponentially with the number of dimensions D in order 
to prevent the filter from failing [7, 2], We present a method 
to alleviate this problem in Section III. 

II. Monte Carlo and the Curse of 
Dimensionality 

In this section we will briefly review some basic Monte 
Carlo methods and subsequently explain the Particle Fil¬ 
ter, for a more extensive discussion we refer the reader 
to [5, 3, 1, 10]. The objective in Particle Filtering is to 
approximate expectations with respect to the posterior from 
Eq. 3, p(x 2 |z/: 2 )-. using a set of samples {x^}- If we draw a set 
of samples {x? 2 } ~ p{ x -. 2 \y-. 2 ) we can always simply discard 
{X-i} in order to obtain a set of samples {X 2 } ~ p{ x 2 \y-. 2 j- 
There is therefore no need to integrate out previous states 
and Eq. 3 becomes 

p(x.. 2 \y-. 2 ) tx p(y 2 \x 2 )p{x 2 \xi)p(x. A \y :l ). (4) 

We will thus be concerned with drawing, or approximately 
drawing samples {x\ 2 } ~ p(.x, 2 \y :2 ). 

Firstly we will recall some basic notions of sampling 
which are needed to follow the subsequent derivations. For 
more details, please refer to [8, 10, 1]. 

A. Simple Monte Carlo 

We are interested in approximating expectations using 
sampling. That is, we want to find a set of samples {x 1 }, 
such that 

[ f{x)p( x = X)~ tYI /(**) (5) 

^ 1=1 

i.e. the right hand side is a good approximation for the left 
hand side. 

It is easy to show that, if the samples {x z } are drawn 
from p(x), then Eq. 5 provides a consistent estimator, which 
means that for the number of samples L —> 00 the two 
sides become equal [8, 1]. Therefore, given enough samples, 
we can compute the mean, the covariance or any other 
property of the distribution p(x). The standard deviation 


of the estimate is proportional to -^=, independently of the 
dimension. 

The discussion above implies that, if we were able to 
sample efficiently from the posterior from Eq. 4, p(x :2 \y :2 ), 
then we could approximate the expectations of interest even 
in high dimensional systems with no need for an excessive 
number of samples. Unfortunately, in the vast majority of the 
cases of interest, this distribution is highly complex and it is 
impossible to sample from it directly [8, 1, 4]. Therefore 
more sophisticated schemes have to be used to evaluate 
expectations in most real world problems. 


B. Importance Sampling 

Importance Sampling (IS) is such a scheme which can be 
used when it is impossible to sample from p(x). Since it is 
not possible to generate samples from the random variable 
x we introduce an auxiliary random variable ip which is dis¬ 
tributed according to the so called proposal distribution. This 
distribution is chosen such that we can easily draw samples 
from it. To use these samples for computing expectations 
with respect to x, they have to be weighted to account for 
the mismatch between the distributions of x and ip. 

To find these weights we expand Eq. 5 as 


f(x)p(x = X) 


4 f(x)p( x = X) 

4 p( x = x) 

J x f(x)$^jP(P = x) 

4 = 

4 f(x)u(x)p(<P = X) 

4 = x) 


where we have defined ui(x) oc ^4_Xy. Very importantly, 

w(x) can be any function which is proportional to p^J^ , 
since it appears in the numerator as well as the denominator 
and any constant will canceled. This means that we need to 
know p(x) and p(<p) only up to a normalization constant. 
The distribution p(ip) can be chosen freely, as long as its 
support contains the support of p[x). Choosing p(ip) such 
that we can sample from it, we can apply simple Monte Carlo 
to approximate both the numerator and the denominator 
separately 


f (x)p( x = x) 


Ef=i/(xW) 

Ef=iw(x J ) 


with lx*} ~ p(<p). It is easy to show that this estimator 
converges to the true expectation with probability equal to 
1 as L —► 00 [8]. The standard deviation of the estimator 
is again proportional to but, as opposed to simple 
Monte Carlo, it now also depends heavily on the proposal 
distribution p(tp). There will necessarily be some mismatch 
between the proposal and the desired distribution, which 
leads to the variance of the estimator growing exponentially 
in the number of dimensions of the state space [1, 8]. This 
high variance is a consequence of weight degeneracy, i.e. 
almost all of the weight being concentrated on a very small 











Fig. 2. The belief network describing the evolution of the random process 
of interest x and another artificial random process ip. 


subset of the samples. 

Let us now apply IS to the problem we are trying to 
address, which is computing expectations with respect to 
p(x, 2 \y, 2 ). Since we cannot sample from this distribution 
directly, we define an alternative process p(p-. 2 \y-. 2 ) which 
approximates the actual process, see Fig. 2. Having chosen 
the proposal distribution we have to compute the weights in 
order to account for the mismatch between the distribution 
of interest and the proposal distribution. 


W 2 (X:2) OC 


P{.X ,2 = X:2 |Z/:2 ) 
P{<P:2 = X:2 12/:2) 


(6) 


In dynamical systems, these weights can be computed se¬ 
quentially, which is called Sequential Importance Sampling 
(SIS). 


C. Sequential Importance Sampling 

Fig. 2 shows both the process x we are trying to estimate 
and the auxiliary process <p which we created. When sam¬ 
pling, we have knowledge of all the past observations and 
the past samples we generated, therefore (p 2 can depend on 
all the preceding ip and y. 

To compute the weights sequentially, we have to write both 
the denominator and the numerator of Eq. 6 recursively. The 
numerator can be computed recursively according to Eq. 4. 
The denominator, i.e. the proposal distribution, can be written 
recursively as well. Using the independence assumptions 
from the belief network from Fig. 2 we obtain 


p{<p-. 2 \y-. 2 ) = p(v 2 W-.\,y-. 2 )p{w:i\y-.i)- (7) 


Using this result and Eq. 4, Eq. 6 can be re-written as: 

, x P(y2\x 2 =X2)p{X2 = X'2^1 = Xl) , X , Q x 
W 2 (X:2) OC - - -;-C-Wi(X:l). (8) 

PVP 2 = X2 ¥>:1 = X:l,y-.2) 


The proposal distribution can be chosen freely, a very com¬ 
mon choice is p(ip 2 = X 2 \v-.i = X-.uVa) = p{x 2 = X 2 I 2 T = 
Xi), the above recursion then simplifies to 


W 2 (X: 2 ) OC p(y 2 \x 2 =X 2 )wi(X:l)- 


Resuming, SIS provides a way of computing the weights re¬ 
cursively. The final weights will however be exactly the same 


as in standard IS. SIS thus suffers from weight degeneracy in 
the exact same way as IS does. As discussed above, due to 
the unavoidable mismatch between proposal distribution and 
distribution of interest, the number of particles has to grow 
exponentially in the number of dimensions. The dimension 
of the sample X :2 is equal to DT where T is the number 
of time steps and D is the dimension of the state. Since 
DT is typically very high and growing in time, computing 
with an exponential number of particles is intractable, and 
weight degeneracy thus unavoidable. The solution proposed 
in [5] is to discard particles with low weights, which will not 
contribute significantly to the computation of the expectation 
anyways, by resampling. This algorithm is well known as the 
Particle Filter. 

D. The Particle Filter 

In Particle Filtering, we use the standard SIS algorithm 
as long as the weights are not degenerate. As soon as 
they become too concentrated, according some criterion, we 
resample. The most common way of resampling is to redraw 
each particle with a probability proportional to its weight. 
Any resampling strategy which does not bias the estimator 
can be used [3], 

Since in a Particle Filter we can resample after each time 
step, the exploration space has thus effectively been reduced 
from TD to D dimensions. 

If D is small enough such that the state space can be 
explored reasonably well by N samples, the problem of 
weight degeneracy is resolved. In many systems however, 
the dimension of the state space D is too large large to 
be covered by any tractable number of particles. Therefore, 
Particle Filtering in high dimensional state spaces remains 
an open problem [7, 2, 8], 

III. Proposed Method 

The key idea of the Particle Filter is to compute the 
weights recursively in time, such that it is possible to 
resample after each time step. Here we extend this idea and 
propose to compute the weights not just recursively in time, 
but also in the dimensions of the state space, which will 
allow for a resampling step after each dimension. Hence, the 
dimensionality of the space explored in one step is reduced 
from TD to D through the usage of a standard Particle 
Filter and we propose to further reduce it to 1. The presented 
method is referred to as the Coordinate Particle Filter (CPF), 
since it is reminiscent of Coordinate Descent. 

More concretely, we will inject the noise v 2 dimension by 
dimension into the process described by Eq. 1 and update the 
weights after each step. We are therefore required to write 
the weights recursively in the dimensions of the noise. 

A. Explicit noise 

To facilitate the subsequent derivation of the proposed 
method, we formulate the standard Particle Filter in terms of 
the noise variables v. Making the process noise explicit in 
Fig. 2, we obtain Fig. 3. Since the only source of uncertainty 
in the state trajectory is the process noise v, knowledge of 



















Fig. 3. This is the same system as in Fig. 2, with explicit process noise v. 
The measurement noise w is still implicit. 


the noise trajectory v : 2 implies knowledge of the current 
state X 2 . Therefore v -2 is a valid representation of the state 
X 2 of the dynamical system. In practice, it would of course 
be very inefficient to store the state as the noise trajectory, 
but conceptually we can use v :2 and x 2 interchangeably. 
Applying this substitution to Eq. 4 we obtain 

P(V: 2 \y-. 2 ) OC p{y2\V:2)p(v 2 )p(V: 1 \y-. 1 ) (9) 


where we have made use of the fact that the process noises at 
different time steps are independent. Similarly, we can write 
the weights in Eq. 8 in terms of the noise variables. 


U) 2 (v-.2) OC 


P{V2\V,2 = V,2)p(v 2 = V 2 ) , , 

-j-r w l (Ul) 

p{lfi 2 = = Ul,2/:2) 


( 10 ) 


This somewhat unusual formulation is equivalent to the 
standard Particle Filter and it will provide a basis for the 
proposed extension. 


B. Computing the weights recursively in the dimension 

The above recursion has the form ui 2 (^ 2 ) oc 

where we omit the dependency on y for 
simplicity. The current weights are obtained by multiplying 
the previous weights with some function of the extended 
noise trajectory, subsequently they can be resampled if 
necessary for preventing weight degeneracy. However, if 
the dimension of the problem is very large, the weights can 
already be degenerate after just one time step. 

We therefore go one step further and write the weights 
recursively in the dimensions d as well. More precisely, we 
want to find the weight of the noise trajectory {iai, i4 :d } 
up until dimension d as a function of the weight of the 
noise trajectory {v.i, i4 :d-1 } up until dimension d—1. To 
keep notation uncluttered we define u: 2 = { u : 1 , v 2 d }, which 
denotes all the past noises ! and the current noise v 2 d up 
until dimension d. Furthermore it will be convenient to use 
the convention i/: 2 = ix[ > = v- 1 . Using this notation we can 


write the objective of this section as finding a recursion of 
the form u 2 {v\ 2 ) x f{ u \ 2 ) w 2 -1 ( t ': 2 _1 )- 

Similarly to Eq. 6, the weights are defined as the ratio 
between the desired and the proposal distribution. 


w 2 (v : 2 ) oc 


P{V \2 = >':2 V l) 

P(<P\ 2 = ":2 //: J ) 


(ID 


We will inject the noise at time 1 dimension by dimension, 
then the noise at time 2 dimension by dimension and so on. 
Two recursions are therefore required, one in time and one 
in the noise dimension. 

1) Time Recursion: The time recursion has to express the 
current weight cv 2 as a function of the previous weight utf. 
We use the independence assumptions implied by Fig. 3 to 
express both the numerator 


p(v%\y-.2) =p(v[i\y-.2) oc p(y 2 \v\i)p(v:i \y : i) ( 12 ) 


and the denominator 


p{v>\ 2IZA2) = p(p;f \y-.2) OC p(<p\?\y :1 ). ( 13 ) 


Finally, we insert these two equations into Eq. 11 for 
dimension d = 0 to obtain 


^2(^2) ocp(t/ 2 ki = v\i H (u: A ). 


(14) 


This equation defines the time recursion, it incorporates the 
measurement at time 2, without extending the noise trajectory 
yet. 

2) Dimension Recursion: To extend the noise trajectory 
we write the numerator and denominator of Eq. 11 recur¬ 
sively in the dimension d. Using the independence assump¬ 
tions implied by the belief network in Fig. 3 we can rewrite 
Eq. 9 to obtain a recursive expression for the numerator of 
Eq. 11 


p( v 'i\y-.2) ocp(t>2|u. : 2 \m)pW£ 1 \y-.2) 


p{v2\v'i)p{v^) 

) 


pWt 1 


p(y \2 \y-.2) (15) 


Similarly we can rewrite the sampling dynamics from Eq. 7 
to obtain a recursive expression for the denominator 


p(.vi\y-.2) ocpOslv^ \2/:2 )pO- 2 1 \y-.2)- (16) 


Inserting Eq. 15 and Eq. 16 into Eq. 11 yields the equation 
for injecting the noise dimension by dimension 


w 2 (u 2 ) 


P(P2k:2 = U 2 ) 

p{y2 Wt x = vit 1 




( 17 ) 


where we have defined 


■= 


P{V d 2 = 4) 


p{'ti = '4\vt 1 = vT L ,v*) 


d—l 


3) Algorithm: Eq. 14 and Eq. 17 enable us to update the 
weights dimension wise, as desired: 

• We start out with the previous weight 

• We apply Eq. 14 to obtain 

• We iteratively apply Eq. 17 to obtain 




























Writing out the above algorithm we obtain 


°c 


p(y2\v\? = v'£)p{v'2 D = v'i D ) D 


P{P 2 = v 2 \<P\l = U \l .J/:2) 


<(*1 ) 


which is equivalent to the weight update for the Particle Filter 
from Eq. 10. Since the denominator of the fraction in Eq. 17 
is equal to the numerator of the previous time step, all the 
intermediate terms p(y 2 \v:$ = v':^) with d < D cancel each 
other out. 

This is no surprise since we merely divided up the Particle 
Filter equations into smaller steps, but after iterating over 
all dimensions we will still obtain the same result for the 
weight. If there is no resampling in the CPF within a time 
step, it will yield the exact same result as the PF. Differences 
appear however when resampling is required within time 
steps, before all noise dimensions have been incorporated. 
The CPF will in that case discard unlikely samples early, 
which is not possible with the PF. 


C. Approximations 

Unfortunately, not all terms in Eq. 14 and Eq. 17 are 
easy to compute. The terms </>f(tA : f) are unproblematic, since 
the numerator is the process noise distribution, which is 
Gaussian, and the denominator is the proposal distribution, 
which we are free to choose. 

The terms of the form p{y 2 can however in most 
cases not be obtained in closed form, except of course for 
p(y 2 |u. : 2 > ), which is simply the likelihood and also occurs in 
the standard Particle Filter, see Eq. 10. The terms for d < D 
however require computing the integral 

p{y 2 \v : %) = f P{y2\v'£)p{v$ +1:D ) (18) 

Jv2 + 1 ' D 

which in most cases cannot be done in closed form. Below 
we propose an approximation which is feasible for any 
dynamical system, it is however very important to note that 
all the approximate terms p(xj 2 u : : 2 ) for d < D cancel each 
other out, as shown above, and the full weight u? 0;?) is 
exact. 

Consequently, if no resampling in between full samples 
is required, i.e. we have enough samples to cover the D 
dimensions of the state space, then the proposed algorithm 
is equivalent to the standard Particle Filter, no matter what 
approximation is used. If there is resampling in between full 
samples, then of course the approximation matters. 

If the approximation provides a good indication as to 
whether the full sample will have a large weight, then we 
can expect the proposed method to work well. If however the 
approximation is poor, then resampling based on the partial 
weights can lead us to discard good candidates. To shed some 
light on these issues, we will compare the performance of 
the proposed method with the standard PF under different 
conditions in the experimental section. 

Dirac Approximation: There are many possible ways of 
approximating the integral from Eq. 18, the purpose is to 
have at least a rough idea of the likelihood of a sample before 
sampling all the dimensions. One option is to approximate 



Fig. 4. The covariance shape varies from a sphere (p = 0.0, no correlation) 
to a line (p = 1.0, maximum correlation). 


the process noise p(v 2 +1D ), which is a standard normal 
distribution, by a Dirac distribution centered at zero, the 
solution to the integral is then easy to obtain: 

p{y 2IC2) = p{y2\v$,v$ +1:D = 0). 

This very simple approximation worked well in most of our 
experiments and is applicable to any system model. 


IV. Application to a Linear Gaussian System 

Let us consider a simple linear Gaussian system in order 
to illustrate the proposed method. The state of this system 
could of course be inferred optimally using a Kalman filter. 
Nonetheless it will provide us with some insight into the 
characteristics of the proposed filter. 

For convenience, we write the process model in functional 
form 


X 2 = X\+ V 2 

where is drawn from a standard normal distribution. The 
observation model is given as a distribution: 

P{y2\x 2 ) = N(y 2 \x 2 ,Q) 

For simplicity we will consider a system in which all the 
measurement variables have a variance equal to 1. Further¬ 
more we assume that the Pearson correlation between the 
different measurement dimensions y l and y :1 is equal to p 
for all i,j. All the diagonal elements of Q will thus be equal 
to one, and all the off diagonal elements equal to p. Q has 
D eigenvalues A d where Ai = 1 — p for i = 1 • • • D — 1 
and Xu = (D — \)x + 1. This scaling is visualized for a 
two-dimensional system in Fig. 4. Varying p will allow us 
to investigate how the proposed method behaves depending 
on how correlated different dimensions are. The second 
parameter we will vary is the dimension D of the system. 
In the following, we compare the standard Particle Filter 
with the proposed exact and approximate Coordinate Particle 
Filter. 






A. Particle Filter 


We choose the proposal distribution according to our 
process model 

p(v 2 = X2\p-.l = X-.WV-.i) = N {X2 \XiJ) 

the importance weights from Eq. 8 then become 

W2(X:2) OC N(y 2 \x2,Q)ul{X:l) 

that is, we simply re-weight according to the observation 
model. 

B. Exact Coordinate Particle Filter 

First of all we will make the noise explicit. Our system is 
thus described by p(y 2 \v- 2 )- This distribution can be obtained 
from the process and observation model: 

P{y 2 \v\%) = N(y 2 \J2 v i D ,Q)- 

t 

To be able to compute the partial weights, we have to solve 
the integral in Eq. 18. For this very simple system, this 
integral can be solved analytically, which means that the 
partial weights in Eq. 14 and Eq. 17 can be computed without 
any approximation. We will refer to the filter using these 
weights as the exact Coordinate Particle Filter. 

C. Approximate Coordinate Particle Filter 

In most cases we will not be able to solve Eq. 18 
analytically, we will thus apply the approximation method 
described above, where p(v$ +1:D ) is approximated by a 
Dirac spike, for comparison. We refer to the resulting filter 
as the approximate Coordinate Particle Filter. 


0.9 
0.8 
0.7 
0.6 
0.5 
0.4 
0.3 
0.2 
0.1 

1 2 5 10 20 40 80 

Degrees of Freedom 

Fig. 5. The probability of the error of the exact CPF being smaller than 
the eiTor of the PF. 
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Fig. 6. The probability of the error of the approximate CPF being smaller 
than the error of the PF. 


CPF. This trend is not surprising, since the more correlated 
the dimensions of a system, the harder it is to explore each 
dimension independently. 


D. Experimental Setup 

For each filtering algorithm we run 10 simulations per 
parameter set. At each time step we compute the root 
mean squared (RMS) error between the estimate by the 
filter and the true state. Finally, we compute the mean and 
variance of the RMS-error across the time steps and runs. We 
therefore end up with an error mean and variance per filter 
and parameter set, which can be interpreted as a Gaussian 
distribution over the error. To compare the CPF and the PF 
we use these two distributions to compute the probability of 
the error of the CPF being smaller than the one of the PF. 

E. Results 

In Fig. 5 we compare the exact CPF with the PF for 
different numbers of degrees of freedom and correlations. 
For dimension 1 and 2 both work well, there is thus not 
much difference, i.e. the probability of the exact CPF having 
a smaller error is close to 0.5. For the rest of the Fig. 
a clear trend can be seen: For high correlation and low 
dimensionality the standard Particle Filter performs better, 
and with increasing dimension and decreasing correlation the 
exact CPF gains more and more advantage. A similar trend 
can be seen for the approximate CPF in Fig. 6, with the 
difference that the performance of the approximate method 
compares a little bit less favorably to the PF than the exact 
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Fig. 7. The error in function of the correlation for a fixed dimension 
D = 5. 


In Fig. 7 we plot the error as a function of the correlation, 
this graph represents the same data as one column of the 
heat maps above. The performance of the approximate CPF 
degrades very quickly due to the crude approximation made. 

In Fig. 8 we plot the error as a function of the dimension¬ 
ality. The performance of the Particle Filter degrades much 
quicker than the performance of the proposed methods. 

The insight from this experiment is that in very high¬ 
dimensional systems the PF will perform poorly, whereas 
the proposed method can perform well if the dimensions are 
not too strongly correlated. 

















B. Simulation 
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Fig. 8. The error in function of the number of degrees of freedom for a 
fixed correlation of 0.4. 


V. Application to Tracking using Range Images 

We apply the Coordinate Particle Filter to 3D object 
tracking using an RGB-D camera, as proposed in [11], The 
state consists of the 3D poses of the objects of interest and 
the observations are range images. The observation model 
predicts the measurement to be in the proximity of the 
rendered 3D object model. 

For evaluation we track several independently moving 
objects with the proposed Coordinate Particle Filter as well 
as with a standard Particle Filter. We could break the system 
down into low dimensional sub systems and track each object 
separately, which might work well since they are only weakly 
interacting. The objective of this experiment is however to 
understand whether the proposed method can exploit the 
structure of the system autonomously. This is important, 
since in different dynamical systems the algorithm might 
be able to exploit the structure in a way which is not as 
obvious as in this case. Furthermore, this setup allows for 
a straight forward analysis of the scalability with respect to 
the dimensionality of the state space, simply by increasing 
the number of objects. 

A. Experimental Setup 

We conducted experiments both in simulation and on 
real range data. In both cases we continuously move boxes 
while tracking their poses. To investigate scaling with the 
dimensionality of the state, we evaluate the algorithm with 
1, 3, 6, 9 and 12 moving boxes. Each box has 6 degrees 
of freedoms which need to be estimated. Since both the 
Particle Filter and the proposed Coordinate Particle Filter 
are sampling based, we run each experiment ten times on 
the same dataset in order to collect some statistics. 

The Coordinate Particle Filter requires N cp fD evaluations 
of the likelihood per time step, where N cp f is the number 
of particles and D is the number of degrees of freedom. 
The standard Particle Filter requires only N p f evaluations 
per time step, where N p f is the number of particles. For fair 
comparison we ensure that both filters use the same number 
of evaluations. Therefore we set N cp f = -yy and for all 
experiments we have N p f = 2000. 


The range data is generated from a simulation that ex¬ 
hibits similar artifacts as a real RGB-D camera (occlusion 
boundaries, quantization steps, perlin-like noise) 1 . For the 
evaluation of the accuracy we compute the root-mean-square 
error (RMSE) between the estimated pose and the true 
pose, averaged across all time steps. In Fig. 9 we plot 
the mean and standard deviation of the RMSE across the 
ten runs. For 6 and 18 dimensions there seems to be no 
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Fig. 9. The proposed CPF is compared to the standard PF for different 
numbers of degrees of freedom on simulated range data. The dimensionality 
of the state does not affect the performance of the proposed method, the 
standard PF however degrades rapidly. 
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significant difference between the CPF and the standard PF. 
As expected, the standard PF starts degrading very rapidly for 
higher dimensional states, whereas the CPF seems to be able 
to exploit the fact that there are only few strong dependencies 
between the dimensions of this dynamical system. 

C. Real Data 

To confirm the validity of the simulation based experiment, 
we run a very similar experiment with real range images from 
an RGB-D camera (cf. Fig. 11). We record a scene in which 
the corresponding number of boxes are manually moved. In 
this setup no ground truth is available. For the evaluation, 
we assume that the ground truth poses are equal to the mean 
poses across all individual runs on the same dataset. Thus, 
in contrast to the simulation based experiments, the errors 
plotted in Fig. 10 represent the variance of the estimate. 
Nevertheless, Fig. 10 conveys the same development as 
the simulation based experiments, shown in Fig. 9. This 
suggests, that the simulation experiments are indeed a good 
indicator for the performance on real data. 

VI. Application to Manipulator Tracking 

In this section, we apply the proposed method to manipu¬ 
lator tracking. We simultaneously track the 30 joint angles of 
the two arms, including the fingers, of the robot in Fig. 12. 
The observation model is the same as described above, but 
now the different objects are not moving freely anymore but 
are constrained through the robot’s kinematic chain. Unlike 
in the multi object tracking example above, this system could 

1 The source code is available at https://github.com/jbohg/ 
render_kinect. 
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Fig. 10. The CPF is compared to the standard PF for different numbers 
of degrees of freedom on real range data from an RGB D camera. 




Fig. 11. Real point cloud displayed on the left and the corresponding range 
image. 


not easily be broken down into several weakly interacting 
subsystems which can be tracked individually. Nevertheless, 
as we can see in Fig. 13, the proposed method outperforms 
the standard Particle Filter on simulated range data. In the 
accompanying video, we show qualitative results for tracking 
the real robot arm. 

VII. CONCLUSIONS 

To sufficiently cover the state space, the standard Particle 
Filter requires the number of samples to grow exponentially 
in the dimensionality of the state space. This precludes its 
usage in high dimensional systems. We propose a method 
called the Coordinate Particle Filter (CPF), which computes 
the approximate weights recursively in both the time steps 
as well the degrees of freedom of the state, as opposed to 
the standard PF which is recursive only in the time steps. 
Therefore the CPF can resample per dimension before even 
drawing a full state sample. This prevents the particle weights 
from becoming degenerate even in high dimensional state 
spaces. 

Experimental results indicate that the proposed method 
is able to automatically exploit the structure of dynamical 
systems where not all dimensions are strongly dependent 
on each other. It can therefore perform well in very high¬ 
dimensional systems which are beyond the scope of a stan¬ 
dard Particle Filter. 

An open question is what kind of dependency structures 
the proposed method is able to exploit. Another interesting 



Fig. 12. The robot model is represented in white, and the simulated range 
data in blue. The robot consists of two WAM arms with hands by Barrett [9]. 



Fig. 13. The CPF is able to track the robot motion well despite the 
high dimensionality. The RMS error is computed between the estimated 
and ground-truth 30-dimensional joint configuration of the two robot arms. 


direction of future research is to investigate the importance 
of the order in which dimensions are explored, and if an 
optimal order could be determined automatically. 

References 

[1] D. Barber. Bayesian Reasoning and Machine Learning. Cambridge University 
Press, 2012. 

[2] P. Bickel, B. Li, and T. Bengtsson. Sharp failure rates for the bootstrap particle 
filter in high dimensions, volume Volume 3 of Collections, pages 318-329. 
Institute of Mathematical Statistics, Beach wood, Ohio, USA, 2008. 

[3] O. Cappe, S. Godsill, and E. Moulines. An overview of existing methods and 
recent advances in sequential Monte Carlo. 95(5):899-924, May 2007. 

[4] A. Doucet, N. de Freitas, and N. Gordon. An introduction to sequential Monte 
Carlo methods. In A. Doucet, N. de Freitas, and N. Gordon, editors, Sequential 
Monte Carlo Methods in Practice, Statistics for Engineering and Information 
Science, pages 3-14. Springer New York, 2001. 

[5] N. J. Gordon, D. J. Salmond, and A. F. Smith. Novel approach to nonlinear/non- 
Gaussian bayesian state estimation. Radar and Signal Processing, IEE Proceed¬ 
ings F, 140(2): 107-113, April 1993. 

[6] S. J. Julier and J. K. Uhlmann. A new extension of the Kalman filter to nonlinear 
systems. Int. Symp. Aerospace/Defense Sensing, Simulation and Controls, pages 
182-193, 1997. 

[7] B. Li, T. Bengtsson, and P. Bickel. Curse-of-dimensionality revisited: Collapse 
of importance sampling in very large scale systems. 

[8] A. B. Owen. Monte carlo theory, methods and examples (book draft), 2013. 
Available at http: //www-stat. stanford.edu/owen/mc/. 

[9] L. Righetti, M. Kalakrishnan, P. Pastor, J. Binney, J. Kelly, R. C. Voorhies, 
G. Sukhatme, and S. Schaal. An Autonomous Manipulation System based on 
Force Control and Optimization. Autonomous Robots Journal Special Issue: 
Autonomous Grasping and Manipulation, 2013. 

[10] C. P. Robert and G. Casella. Monte Carlo Statistical Methods. Springer Texts 
in Statistics. Springer New York, 2004. 

[11] M. Wiithrich, P. Pastor, M. Kalakrishnan, J. Bohg, and S. Schaal. Probabilistic 
object tracking using a range camera. In IEEE/RSJ Inti Confon Intelligent Robots 
and Systems, pages 3195-3202, Nov 2013. 


























